{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "toc": true
   },
   "source": [
    "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
    "<div class=\"toc\" style=\"margin-top: 1em;\"><ul class=\"toc-item\"><li><span><a href=\"#Intro\" data-toc-modified-id=\"Intro-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Intro</a></span></li><li><span><a href=\"#Markov-Models\" data-toc-modified-id=\"Markov-Models-2\"><span class=\"toc-item-num\">2&nbsp;&nbsp;</span>Markov Models</a></span><ul class=\"toc-item\"><li><span><a href=\"#Transition-Matrix\" data-toc-modified-id=\"Transition-Matrix-2.1\"><span class=\"toc-item-num\">2.1&nbsp;&nbsp;</span>Transition Matrix</a></span></li><li><span><a href=\"#Latent-Variable-and-State-Space\" data-toc-modified-id=\"Latent-Variable-and-State-Space-2.2\"><span class=\"toc-item-num\">2.2&nbsp;&nbsp;</span>Latent Variable and State Space</a></span></li><li><span><a href=\"#Hidden-Markov-Model-(HMM)\" data-toc-modified-id=\"Hidden-Markov-Model-(HMM)-2.3\"><span class=\"toc-item-num\">2.3&nbsp;&nbsp;</span>Hidden Markov Model (HMM)</a></span></li></ul></li></ul></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Intro"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Exploratory notebook related to introductory concepts and theory behind Markov Models. Includes toy examples implementation and relative visualization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# basic libraries import\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "%matplotlib notebook\n",
    "\n",
    "sns.set_context(\"paper\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Markov Models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also called Markov Chains, Markov Models are a common approach to treat sequential data. Sequential data can be seen as a set of dependent data points for which we consequently lose the independent and identically distributed (i.i.d.) assumption.\n",
    "\n",
    "We still assume stationarity and limited horizon, meaning that future predictions depend only on a fixed amount of most recent observations. \n",
    "For example if we assume that a data point depends only on the previous most recent observation, we are dealing with *first-order Markov chain*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Transition Matrix"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A model can be represented via a transition matrix. Each entry represents the probability to move from current state (row value) to next state (column value). Each row must sum up to 1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# possible states of our system\n",
    "states = ['sun', 'rain', 'cloud']\n",
    "\n",
    "# transition matrix\n",
    "data = [\n",
    "    [0.5, 0.1, 0.4],\n",
    "    [0.2, 0.4, 0.4],\n",
    "    [0.4, 0.3, 0.3]\n",
    "]\n",
    "transition_matrix = pd.DataFrame(data, index=states, columns=states)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "transition_matrix"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# each row must sum up to 1\n",
    "for state in states:\n",
    "    assert transition_matrix.loc[state].sum() == 1.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One can derive a transition matrix from historical data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "observations = np.array([\"sun\", \"sun\", \"cloud\", \"cloud\", \"rain\", \"cloud\", \"sun\", \"sun\", \"rain\", \"cloud\", \"cloud\", \"sun\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# count how many time each transition (from_state -> to_state) occured in the past\n",
    "transition_matrix = pd.DataFrame(np.zeros((len(states), len(states))), index=states, columns=states)\n",
    "for (from_state, to_state) in list(zip(observations, np.roll(observations, -1)))[:-1]: \n",
    "    transition_matrix.set_value(from_state, to_state, transition_matrix.loc[from_state, to_state]+1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "transition_matrix"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# normalize to obtain transition probabilities\n",
    "transition_matrix = transition_matrix.div(transition_matrix.sum(axis=1), axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "transition_matrix"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Latent Variable and State Space"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we were to consider a second-order Markov chain we would have to specify the transition probabilities for each pair of states.\n",
    "The number of parameters then grows exponentially with the horizon/window size, so in order to overcome this limitation, *latent variables* have been introduced. This leads to *state space models*.\n",
    "\n",
    "* Hidden Markov Model (HMM) if latent variables are discrete\n",
    "* Linear Dynamical Systems (LDS) if latent variables are continuous (Gaussian)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Hidden Markov Model (HMM)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we saw before, for HMM we are dealing with discrete latent variables. This means that each state of the model can emit/generate different outputs, each of which has an associated emission probability. In other words each state has a probability distribution of possible outputs.\n",
    "Notice also that each output can belong to more that one state."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TODO: concrete example of HMM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:image-processing]",
   "language": "python",
   "name": "conda-env-image-processing-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  },
  "toc": {
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "toc_cell": true,
   "toc_position": {},
   "toc_section_display": "block",
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
